LPU Programming
Projections and functions

Projections are conceptual relationships between dimensions. When a projection links two dimensions, every artefact in the source dimension has an image in the target dimension. Two source artefacts may share the same image, but a source artefact never has more than one image under a given projection. When an artefact is added to a dimension, the LPU automatically creates its images in the connected dimensions and establishes the links between the artefacts accordingly.

The LPU needs to know how to compute the artefact images; otherwise one would only know that a given artefact has a projection in a given dimension, but not the content of its image. The LPU therefore implements a number of functions that may be used to compute the content of the image artefacts.

With a limited number of functions, only a limited range of linguistic processing can be done. This is why the LPU allows linguistic programmers to upload their own functions and use them to compute projections. These functions are called library functions, since they may be used and reused for many projections: once a library function has been uploaded to an LPU server, it may be used in all projects hosted on that server.

In summary, a projection is a link between dimensions, and each projection is computed by a library function. Note that the library function is an operational concept; it is not part of the linguistic dimension-projection model of the LPU.

Properties of library functions

Like all other LPU objects, library functions are described using a number of properties.

Basic function properties

The following properties are mandatory for the description of a function.
*name: a unique name identifying the function;
*language: the programming language used to implement the function (see below);
*inputs: the type of data that the function takes as input;
*returns: the type of data that the function returns;
*usage: the usage for which the function is intended (see below);
*program: the source code of the function.

The LPU performs a number of checks when a new function is created:
#the name is unique;
#the programming language is one of the languages known by the LPU;
#inputs and outputs are known data types;
#usage is one of the known possible usages for functions;
#the source program follows the syntactic constraints of the selected language.

Function usage

The properties above introduce the concept of "function usage". The LPU allows functions to be uploaded not only to compute projections, but also to compute aggregations. Since the same base mechanism is used for both, one has to tell the LPU which of the two usages the new function is intended for. See Programming projections and Programming aggregations.

Function inputs and outputs

The input and output types of a function tell the LPU whether the function manipulates numbers, text or vectors. They are the same as the artefact data types, and when a projection uses a function, the LPU checks that the I/O data types of the function match the artefact types of the source and target dimensions. See Artefact data types.

Function source code

The source code of the function must be provided in one of the languages implemented in the LPU. See LPU Programming Languages.

Advanced function properties

#arguments: some functions may be parameterized; for example, a pattern matching function has one parameter: the pattern. See Function and projection parameters.
#requirements: some functions require that the LPU implements some specific features; this is the case with functions that call third party software.
Since the function obviously cannot include that software itself, the software must be accessible from the LPU server. See Integrating third party software.
#constraints: some functions cannot be computed in an arbitrary order; for example, a multi-language POS tagging function may require that the language of the source text has already been identified, otherwise it will not know which algorithm to use. The LPU is able to take such constraints into account, provided that they are declared once and for all. See Function and projection constraints.
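
The mandatory properties, the creation checks and the I/O type matching described on this page can be illustrated with a minimal sketch. It is not the LPU's actual interface: the class `LibraryFunction`, the helpers `check_function` and `check_projection`, the type and usage labels, and the assumption that Python is one of the known languages are all hypothetical and chosen only for illustration.

```python
import ast
from dataclasses import dataclass, field

# Hypothetical values: the real LPU lists of languages, types and usages may differ.
KNOWN_LANGUAGES = {"python"}                   # assumed; see LPU Programming Languages
KNOWN_TYPES = {"number", "text", "vector"}     # numbers, text or vectors
KNOWN_USAGES = {"projection", "aggregation"}   # the two usages named on this page


@dataclass
class LibraryFunction:
    # Basic (mandatory) properties
    name: str
    language: str
    inputs: str
    returns: str
    usage: str
    program: str
    # Advanced (optional) properties
    arguments: list = field(default_factory=list)
    requirements: list = field(default_factory=list)
    constraints: list = field(default_factory=list)


def check_function(fn, registry):
    """Return the problems found; an empty list means the function is accepted."""
    problems = []
    if fn.name in registry:
        problems.append("name is not unique")
    if fn.language not in KNOWN_LANGUAGES:
        problems.append("unknown language")
    if fn.inputs not in KNOWN_TYPES or fn.returns not in KNOWN_TYPES:
        problems.append("unknown input or output type")
    if fn.usage not in KNOWN_USAGES:
        problems.append("unknown usage")
    if fn.language == "python":
        # Syntax check only; the program is parsed, never executed.
        try:
            ast.parse(fn.program)
        except SyntaxError:
            problems.append("syntax error in program")
    return problems


def check_projection(fn, source_type, target_type):
    """True when fn may compute a projection between dimensions of these types."""
    return (fn.usage == "projection"
            and fn.inputs == source_type
            and fn.returns == target_type)


# Usage: register a hypothetical function that counts the tokens of a text.
registry = {}
token_count = LibraryFunction(
    name="token_count", language="python", inputs="text", returns="number",
    usage="projection",
    program="def run(text):\n    return len(text.split())\n",
)
if not check_function(token_count, registry):
    registry[token_count.name] = token_count
```

In this sketch the checks collect every problem rather than stopping at the first one, so that a programmer uploading a function could see all rejected properties at once; whether the LPU reports errors this way is not stated here.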